A basis representation of constrained MLLR transforms for robust adaptation

Authors

  • Daniel Povey
  • Kaisheng Yao
Abstract

Constrained Maximum Likelihood Linear Regression (CMLLR) is a speaker adaptation method for speech recognition that can be realized as a feature-space transformation. In its original form it does not work well when the amount of speech available for adaptation is less than about five seconds, because of the difficulty of robustly estimating the parameters of the transformation matrix. In this paper we describe a basis representation of the CMLLR transformation matrix, in which the variation between speakers is concentrated in the leading coefficients. When adapting to a speaker, we can select a variable number of coefficients to estimate depending on the amount of adaptation data available, and assign a zero value to the remaining coefficients. We obtain improved performance when the amount of adaptation data is limited, while retaining the same asymptotic performance as conventional CMLLR. We demonstrate that our method performs better than the popular existing approaches, and is more efficient than conventional CMLLR estimation.
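The idea of estimating only the leading basis coefficients and zeroing the rest can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the function names, the `per_frame` rule for choosing how many coefficients to estimate, and the assumption that the transform with all coefficients set to zero is the identity `[I | 0]`. The basis matrices would in practice be learned from training speakers and the coefficients estimated by maximum likelihood; here both are simply passed in.

```python
def num_coefficients(num_frames, dim, per_frame=0.2):
    """Hypothetical rule: estimate more coefficients as adaptation data grows."""
    # A dim x (dim + 1) affine transform has at most this many free parameters.
    max_coeffs = dim * (dim + 1)
    return min(int(per_frame * num_frames), max_coeffs)


def build_transform(coeffs, bases, dim):
    """Return W = [I | 0] + sum_b coeffs[b] * bases[b].

    Basis matrices beyond len(coeffs) are skipped, which is the same as
    assigning their coefficients a value of zero.
    """
    # Assumed default: the identity transform, i.e. no adaptation.
    w = [[1.0 if r == c else 0.0 for c in range(dim + 1)] for r in range(dim)]
    for d_b, basis in zip(coeffs, bases):
        for r in range(dim):
            for c in range(dim + 1):
                w[r][c] += d_b * basis[r][c]
    return w


def apply_transform(w, x):
    """Apply the affine feature-space transform to one feature vector."""
    x_ext = list(x) + [1.0]  # trailing 1 makes the last column a bias term
    return [sum(w_rc * x_c for w_rc, x_c in zip(row, x_ext)) for row in w]


# Toy 2-dimensional example with two made-up basis matrices.
toy_bases = [
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],  # shifts the first feature
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # rescales the second feature
]
no_data = apply_transform(build_transform([], toy_bases, 2), [3.0, 4.0])
one_coeff = apply_transform(build_transform([0.5], toy_bases, 2), [3.0, 4.0])
```

With no coefficients estimated the features pass through unchanged, and each additional coefficient moves the transform further from the identity along one basis direction. Under these assumptions, a speaker with very little data is adapted with only a few parameters.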


Similar resources

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...



Beyond linear transforms: efficient non-linear dynamic adaptation for noise robust speech recognition

In this paper, we present new theory and results that combine constrained Maximum Likelihood Linear Regression (MLLR), known as feature space MLLR (fMLLR), a state-of-the-art model adaptation technique, with Dynamic Noise Adaptation (DNA), a state-of-the-art noise adaptation algorithm. We explain how DNA implements a highly non-linear transform on speech model features, and why DNA is better su...


Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation

This paper presents a geometric constrained transformation approach for fast acoustic adaptation, which improves the modeling resolution of the conventional Maximum Likelihood Linear Regression (MLLR). For this approach, the underlying geometry difference between the seed and the target spaces is exposed and quantified, and used as a prior knowledge to reconstruct refiner transforms. Ignoring d...


Discriminative Adaptive Training Using the MPE Criterion

This paper addresses the use of discriminative training criteria for Speaker Adaptive Training (SAT), where both the transform generation and model parameter estimation are performed using the Minimum Phone Error (MPE) criterion. In a similar fashion to the use of I-smoothing for standard MPE training, a smoothing technique is introduced to avoid over-training when optimizing MPE-based feature-s...



Journal:
  • Computer Speech & Language

Volume 26

Publication date: 2012